A Low-Complexity Spectro-Temporal Distortion Measure for Audio Intelligibility Improvement

نویسنده

  • CH. KRISHNA
چکیده

A speech pre-processing algorithm is presented to improve the speech intelligibility in noise for the near-end listener. The algorithm improves the intelligibility by optimally redistributing the speech energy over time and frequency for a perceptual distortion measure, which is based on a spectro-temporal auditory model. In contrast to spectral-only models, short-time information is taken into account. As a consequence, the algorithm is more sensitive to transient regions, which will therefore receive more amplification compared to stationary vowels. It is known from literature that changing the vowel-transient energy ratio is beneficial for improving speech intelligibility in noise. Objective intelligibility prediction results show that the proposed method has higher speech intelligibility in noise compared to other reference methods, without modifying the global speech energy. The Kalman filter has been the subject of extensive research and application, particularly in the area of autonomous or assisted navigation. The Kalman filter is a mathematical power tool that is playing an increasingly important role in computer graphics as we include sensing of the real world in our systems. , electrical engineers are picturing the Kalman filter as a design tool for speech, whereby it can estimate and resolve errors that are contained in speech after passing through a distorted channel. Due to this motivating fact, there are many ways a Kalman filter can be tuned to suit engineering applications such as network telephony and even satellite phone conferencing.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-tonoise ratio in the envelope domain (SNRenv), which was demonstrated to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speec...

متن کامل

Speech energy redistribution for intelligibility improvement in noise based on a perceptual distortion measure

A speech pre-processing algorithm is presented that improves the speech intelligibility in noise for the near-end listener. The algorithm improves intelligibility by optimally redistributing the speech energy over time and frequency according to a perceptual distortion measure, which is based on a spectro-temporal auditory model. Since this auditory model takes into account shorttime informatio...

متن کامل

Phoneme Classification Using Temporal Tracking of Speech Clusters in Spectro-temporal Domain

This article presents a new feature extraction technique based on the temporal tracking of clusters in spectro-temporal features space. In the proposed method, auditory cortical outputs were clustered. The attributes of speech clusters were extracted as secondary features. However, the shape and position of speech clusters change during the time. The clusters temporally tracked and temporal tra...

متن کامل

A spectro-temporal modulation index (STMI) for assessment of speech intelligibility

We present a biologically motivated method for assessing the intelligibility of speech recorded or transmitted under various types of distortions. The method employs an auditory model to analyze the effects of noise, reverberations, and other distortions on the joint spectro-temporal modulations present in speech, and on the ability of a channel to transmit these modulations. The effects are su...

متن کامل

Modeling spectro-temporal modulation perception in normal-hearing listeners

The ability of human listeners to detect and discriminate spectro-temporal ripples in sound has been shown to be correlated with speech intelligibility performance in several conditions. Thus, if a model would be able to account for the spectro-temporal processing limits in the auditory system, such a framework could be used to analyze the auditory processes contributing to and limiting speech ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013